Code
family_pdf(NO, mu=0, sigma=c(0.5, 1,1.5), from=-8, to=8, title="Normal") -> p1
family_pdf(LO, mu=0, sigma=c(0.5, 1,1.5), from=-8, to=8, title="Logistic") -> p2
gridExtra::grid.arrange(p1, p2, nrow=1)Lecture 2
NHMRC Clinical Trials Centre, University of Sydney
gillian.heller@sydney.edu.au
\begin{align*} y_{i}&\sim \mathcal{D}\left(\mu_i,\sigma_i,\nu_i,\tau_i\right) \\ g_1(\mu_i)&=\beta_0^{\mu}+s_1^{\mu}(x_{i1})+\ldots+s_p^{\mu}( x_{ip})\\ g_2(\sigma_i)&=\beta_0^{\sigma}+s_1^{\sigma}(x_{i1})+\ldots+s_p^{\sigma}( x_{ip})\\ g_3(\nu_i)&=\beta_0^{\nu}+s_1^{\nu}(x_{i1})+\ldots+s_p^{\nu}( x_{ip})\\ g_4(\tau_i)&=\beta_0^{\tau}+s_1^{\tau}(x_{i1})+\ldots+s_p^{\tau}( x_{ip}) \end{align*} OR
\begin{align*} y_{i}&\sim \mathcal{D}\left(\theta_{i1},\ldots,\theta_{iK}\right) \\ g_k(\theta_{ik})&=\beta_0^{k}+s_1^{k}(x_{i1})+\ldots+s_p^{k}( x_{ip})\quad\text{for }k=1,\ldots,K \end{align*}
\begin{align*} y_{i}&\sim \textcolor{red}{\mathcal{D}\left(\theta_{i1},\ldots,\theta_{iK}\right)} \\ g_k(\theta_{ik})&=\beta_0^{k}+s_1^{k}(x_{i1})+\ldots+s_p^{k}( x_{ip}) \end{align*}
\begin{align*} y_{i}&\sim \textcolor{red}{\mathcal{D}\left(\theta_{i1},\ldots,\theta_{iK}\right)} \\ g_k(\theta_{ik})&=\beta_0^{k}+\textcolor{blue}{s_1^{k}(x_{i1})+\ldots+s_p^{k}( x_{ip})} \end{align*}
Response distribution \textcolor{red}{\mathcal{D}(\theta_1,\ldots,\theta_K)}
Terms \textcolor{blue}{s_j^{k}(x_{ij})}
\begin{align*} y_{i}&\sim \textcolor{red}{\mathcal{D}\left(\theta_{i1},\ldots,\theta_{iK}\right)} \\ \textcolor{green}{g_k}(\theta_{ik})&=\beta_0^{k}+\textcolor{blue}{s_1^{k}(x_{i1})+\ldots+s_p^{k}( x_{ip})} \end{align*}
Response distribution \textcolor{red}{\mathcal{D}(\theta_1,\ldots,\theta_K)}
Terms \textcolor{blue}{s_j^{k}(x_{ij})}
Link functions \textcolor{green}{g_k}(\cdot)
Estimation
Model diagnostics
Parameter interpretation
g_k(\theta_{ik})=\eta_{ik} = \beta_0^{k}+s_1^{k}(x_{i1})+\ldots+s_p^{k}( x_{ip})
\log(\theta_k) = \eta_{ik}
\log\left(\frac{\theta_k}{1-\theta_k}\right) = \eta_{ik}
\textcolor{red}{\text{BUT}}
Choice for a link function also implies a modeling decision.
This decision determines the exact relation between the covariates and the conditional response distribution
It has consequences for both the fit of the model and the interpretation of the estimated regression effects.
The log link implies a multiplicative relationship between model parameter and covariates:
\begin{align*} \log(\theta_j) &= \beta_0^j + \beta_1^jx_1 + \ldots+\beta_p^jx_p\\ \Rightarrow\quad \theta_j &= \exp(\beta_0^j)\cdot\exp(\beta_1^jx_1) \cdots\exp(\beta_p^jx_p) \end{align*}
real line \mathbb{R}: -\infty<y<\infty
positive real line \mathbb{R}_+: 0<y<\infty
bounded interval \mathbb{R}_{(0,1)}: 0<y<1
Normal NO(\mu, \sigma)
Logistic LO(\mu, \sigma) (leptokurtic)
Gumbel GU(\mu, \sigma) (left skewed)
Reverse Gumbel RG(\mu, \sigma) (right skewed)
Normal N(0, \sigma) and Logistic LO(0, \sigma)
Gumbel GU(0, \sigma) and reverse Gumbel RG(0, \sigma)
exponential Gaussian: exGAUS(\mu, \sigma, \nu) for modelling right skew data,
normal family: NOF(\mu, \sigma, \nu) for modelling mean and variance relationships following the power law;
power exponential: PE(\mu, \sigma, \nu) and PE2(\mu, \sigma, \nu) for modelling lepto- and platykurtotic data;
t family: TF(\mu, \sigma, \nu) and TF2(\mu, \sigma, \nu) for modelling leptokurtotic data;
skew normal: SN1(\mu, \sigma, \nu) and SN2(\mu, \sigma, \nu) for modelling skewness
Skew Normal type 1 SN1(0, 1, \nu)
x <- seq(-4,4, length=101)
f1 <- dSN1(x, mu=0, sigma=1, nu=-100)
f2 <- dSN1(x, mu=0, sigma=1, nu=-5)
f3 <- dSN1(x, mu=0, sigma=1, nu=0)
da1 <- data.frame(f=f1, x=x, fam=rep("nu = -100", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("nu = -5",101))
da3 <- data.frame(f=f3, x=x, fam=rep("nu = 0",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam, levels=c("nu = -100", "nu = -5", "nu = 0"))
p9 <- ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=1, aes(colour = fam)) +
xlab("y") + ylab("f(y)")
f1 <- dSN1(x, mu=0, sigma=1, nu=100)
f2 <- dSN1(x, mu=0, sigma=1, nu=5)
f3 <- dSN1(x, mu=0, sigma=1, nu=0)
da1 <- data.frame(f=f1, x=x, fam=rep("nu = 100", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("nu = 5",101))
da3 <- data.frame(f=f3, x=x, fam=rep("nu = 0",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam, levels=c("nu = 100", "nu = 5", "nu = 0"))
p10 <- ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=1, aes(colour = fam)) +
xlab("y") + ylab("f(y)")
gridExtra::grid.arrange(p9, p10, nrow=1)Power Exponential PE(0,1,\nu)x <- seq(-4,4, length=101)
f1 <- dPE(x, mu=0, sigma=1, nu=1)
f2 <- dPE(x, mu=0, sigma=1, nu=2)
f3 <- dPE(x, mu=0, sigma=1, nu=10)
f4 <- dPE(x, mu=0, sigma=1, nu=100)
da1 <- data.frame(f=f1, x=x, fam=rep("nu=1", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("nu=2",101))
da3 <- data.frame(f=f3, x=x, fam=rep("nu=10",101))
da4 <- data.frame(f=f4, x=x, fam=rep("nu=100",101))
da <- rbind(da1, da2, da3, da4)
da$fam <- factor(da$fam,
levels = c("nu=1","nu=2","nu=10","nu=100"))
p9 <- ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=1, aes(colour = fam))+
xlab("y") + ylab("f(y)")
p9t family TF(0,1,\nu)
x <- seq(-5,5, length=101)
f1 <- dTF(x, mu=0, sigma=1, nu=100)
f2 <- dTF(x, mu=0, sigma=1, nu=2)
f3 <- dTF(x, mu=0, sigma=1, nu=1)
da1 <- data.frame(f=f1, x=x, fam=rep("nu=100", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("nu=2",101))
da3 <- data.frame(f=f3, x=x, fam=rep("nu=1",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("nu=1","nu=2","nu=100"))
p9 <- ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=1, aes(colour = fam)) +
xlab("y") + ylab("f(y)")
p9exponential generalised beta type 2,
EGB2(\mu,\sigma,\nu,\tau) skewness and leptokurtosis;
generalised t, GT(\mu,\sigma, \nu, \tau) kurtosis;
Johnson's SU, JSU(\mu,\sigma, \nu, \tau) and \texttt{JSUo}(\mu,\sigma, \nu, \tau) skewness and leptokurtosis;
normal-exponential-t \texttt{NET}(\mu,\sigma, \nu, \tau), robustly location and scale
skew exponential power \texttt{SEP1}(\mu,\sigma, \nu, \tau), \texttt{SEP2}(\mu,\sigma, \nu, \tau), \texttt{SEP3}(\mu,\sigma, \nu, \tau) and \texttt{SEP4}(\mu,\sigma, \nu, \tau) skewness and lepto-platy;
sinh-arcsinh, \texttt{SHASH}(\mu,\sigma, \nu, \tau), \texttt{SHASHo}(\mu,\sigma, \nu, \tau) and \texttt{SHASHo2}(\mu,\sigma, \nu, \tau) skewness and lepto-platy;
skew t, \texttt{ST1}(\mu,\sigma, \nu, \tau), \texttt{ST2}(\mu,\sigma, \nu, \tau), \texttt{ST3}(\mu,\sigma, \nu, \tau), \texttt{ST4}(\mu,\sigma, \nu, \tau), \texttt{ST5}(\mu,\sigma, \nu, \tau) and \texttt{SST}(\mu,\sigma, \nu, \tau) skewness and leptokurtosis.
SEP1(\mu=0, \sigma=1, \nu, \tau)
p1 <- family_pdf(SEP1, mu=0, sigma=1, nu=c(0,1,2), tau=1, from=-4, to=4, title="SEP1(0, 1, nu = 0, 1, 2, tau = 1)")
p2 <- family_pdf(SEP1, mu=0, sigma=1, nu=c(0,1,2), tau=2, from=-4, to=4, title="SEP1(0, 1, nu = 0, 1, 2, tau = 2)")
p3 <- family_pdf(SEP1, mu=0, sigma=1, nu=c(0,1,2), tau=5, from=-2.5, to=2.5, title="SEP1(0, 1, nu = 0, 1, 2, tau = 5)")
p4 <- family_pdf(SEP1, mu=0, sigma=1, nu=c(0,-1,-2), tau=100, from=-2, to=2, title="SEP1(0, 1, nu = 0, 1, 2, tau = 100)")
gridExtra::grid.arrange(p1,p2,p3,p4, nrow=2)exponential, \texttt{EXP}(\mu,\sigma)
gamma \texttt{GA}(\mu,\sigma) member of the exponential family;
inverse gamma \texttt{IGAMMA}(\mu,\sigma);
inverse Gaussian, \texttt{IG}(\mu,\sigma) a member of the exponential family;
log-normal \texttt{LOGNO}(\mu,\sigma) and \texttt{LOGNO2}(\mu,\sigma).
Pareto \texttt{PARETO}(\mu,\sigma), \texttt{PARETO2o}(\mu,\sigma) and \texttt{GP}(\mu,\sigma) for heavy tail;
Weibull \texttt{WEI}(\mu,\sigma), \texttt{WEI2}(\mu,\sigma) and \texttt{WEI3}(\mu,\sigma) used in survival analysis.
Weibull(\mu=1, \sigma)
x <- seq(0.001,5, length=101)
f1 <- dWEI3(x, mu=1, sigma=0.5)
f2 <- dWEI3(x, mu=1, sigma=1)
f3 <- dWEI3(x, mu=1, sigma=6)
da1 <- data.frame(f=f1, x=x, fam=rep("sigma = 0.5", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("sigma = 1",101))
da3 <- data.frame(f=f3, x=x, fam=rep("sigma = 6",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("sigma = 0.5","sigma = 1","sigma = 6"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.7, aes(colour = fam)) +
ylim(c(0, 3)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))Weibull(\mu=2, \sigma)
x <- seq(0.001,5, length=101)
f1 <- dWEI3(x, mu=2, sigma=0.5)
f2 <- dWEI3(x, mu=2, sigma=1)
f3 <- dWEI3(x, mu=2, sigma=6)
da1 <- data.frame(f=f1, x=x, fam=rep("sigma = 0.5", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("sigma = 1",101))
da3 <- data.frame(f=f3, x=x, fam=rep("sigma = 6",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("sigma = 0.5","sigma = 1","sigma = 6"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.7, aes(colour = fam)) +
ylim(c(0, 3)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))Box-Cox Cole and Green, \texttt{BCCG}(\mu, \sigma, \nu) known also as the LMS method in centile estimation.
generalised gamma, \texttt{GG}(\mu, \sigma, \nu)
generalised inverse Gaussian, \texttt{GIG}(\mu, \sigma, \nu)
log-normal family, \texttt{LNO}(\mu, \sigma, \nu) based on the standard Box-Cox transformation
BCCG(\mu=1, \sigma=0.4, \nu)
x <- seq(0.001,3, length=501)
f1 <- dBCCG(x, mu=1, sigma=0.1, nu=-1)
f2 <- dBCCG(x, mu=1, sigma=0.3, nu=-1)
f3 <- dBCCG(x, mu=1, sigma=0.5, nu=-1)
da1 <- data.frame(f=f1, x=x, fam=rep("nu=100", 501))
da2 <- data.frame(f=f2, x=x, fam=rep("nu=2",501))
da3 <- data.frame(f=f3, x=x, fam=rep("nu=1",501))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("nu=1","nu=2","nu=100"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.5, aes(colour = fam)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=12),
axis.text.y=element_text(size=12),
axis.title.x = element_text(size=12),
axis.title.y = element_text(size=12),
legend.text = element_text(size=12))Box-Cox power exponential, \texttt{BCPE}(\mu,\sigma,\nu,\tau) and \texttt{BCPEo}(\mu,\sigma,\nu,\tau) skewness and platy- and leptokurtosis;
Box-Cox t, \texttt{BCT}(\mu,\sigma,\nu,\tau) and \texttt{BCTo}(\mu,\sigma,\nu,\tau) skewness and leptokurtosis;
generalised beta type 2, \texttt{GB2}(\mu,\sigma,\nu,\tau) skewness and platy- and leptokurtosis
BCT (\mu=1, \sigma=0.2, \nu, \tau=10)
x <- seq(0.001,3, length=501)
f1 <- dBCT(x, mu=1, sigma=0.2, nu=-1, tau=10)
f2 <- dBCT(x, mu=1, sigma=0.2, nu=1, tau=10)
f3 <- dBCT(x, mu=1, sigma=0.2, nu=2, tau=10)
da1 <- data.frame(f=f1, x=x, fam=rep("nu = -1", 501))
da2 <- data.frame(f=f2, x=x, fam=rep("nu = 1",501))
da3 <- data.frame(f=f3, x=x, fam=rep("nu = 2",501))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("nu = -1","nu = 1","nu = 2"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.5, aes(colour = fam)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))BCPE (\mu=1, \sigma=0.2, \nu=2, \tau)
x <- seq(0.001,2.5, length=501)
f1 <- dBCPE(x, mu=1, sigma=0.2, nu=2, tau=1)
f2 <- dBCPE(x, mu=1, sigma=0.2, nu=2, tau=2)
f3 <- dBCPE(x, mu=1, sigma=0.2, nu=2, tau=10)
da1 <- data.frame(f=f1, x=x, fam=rep("tau = 1", 501))
da2 <- data.frame(f=f2, x=x, fam=rep("tau = 2",501))
da3 <- data.frame(f=f3, x=x, fam=rep("tau = 10",501))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("tau = 1","tau = 2","tau = 10"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.5, aes(colour = fam)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))| Distribution | gamlss family | no par. |
|---|---|---|
| beta | BE |
2 |
| beta original | BEo |
2 |
| generalized beta type 1 | GB1 |
4 |
| logit normal | LOGITNO |
2 |
| simplex | SIMPLEX |
2 |
Beta(\mu=0.5,\sigma)
x <- seq(0.001,0.999, length=101)
f1 <- dBE(x, mu=0.5, sigma=0.2)
f2 <- dBE(x, mu=0.5, sigma=0.5)
f3 <- dBE(x, mu=0.5, sigma=0.7)
da1 <- data.frame(f=f1, x=x, fam=rep("sigma = 0.2", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("sigma = 0.5",101))
da3 <- data.frame(f=f3, x=x, fam=rep("sigma = 0.7",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("sigma = 0.2","sigma = 0.5","sigma = 0.7"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.5, aes(colour = fam)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))Simplex(\mu=0.5,\sigma)
x <- seq(0.001,0.999, length=101)
f1 <- dSIMPLEX(x, mu=0.5, sigma=0.5)
f2 <- dSIMPLEX(x, mu=0.5, sigma=1)
f3 <- dSIMPLEX(x, mu=0.5, sigma=2)
da1 <- data.frame(f=f1, x=x, fam=rep("sigma = 0.5", 101))
da2 <- data.frame(f=f2, x=x, fam=rep("sigma = 1",101))
da3 <- data.frame(f=f3, x=x, fam=rep("sigma = 2",101))
da <- rbind(da1, da2, da3)
da$fam <- factor(da$fam,
levels = c("sigma = 0.5","sigma = 1","sigma = 2"))
ggplot(data=da, aes(x = x, y=f, group=fam)) +
ggplot2::geom_line(lwd=0.5, aes(colour = fam)) +
xlab("y") + ylab("f(y)") +
theme_bw() +
theme(legend.title=element_blank(),
axis.text.x=element_text(size=16),
axis.text.y=element_text(size=16),
axis.title.x = element_text(size=16),
axis.title.y = element_text(size=16),
legend.text = element_text(size=16))Lecture 2